.SEC(Implications for other languages,,,P87:)

**development of s-LISP**

compilers for other languages; static and dynamic linking

parsing and scanning and symbol tables

show how easy it is now.

extensions: PLANNER, CONNIVER; data bases and pattern matching
?EL1?

simulation languages ==> funargs ==> ACTORS

.SS(On LISP and Semantics)
.turn on "←";
LISP should be the first language learned by Computer Science majors.
As a mathematical language for studying algorithms for data structures, it is
presently without peer.  As you are seeing now, the problems of language implementation
and their solutions are easily described in the implementation of LISP (symbol tables,
hashing, garbage collection, stacks, linked allocation, lists, etc.).  At the theoretical
level, questions of provability of properties of programs are tractable.  As a
programming language, LISP has exceptionally powerful features possessed by few other
languages; in particular, the uniform representation of program and data.  LISP
is also quite useful in understanding the ⊗→semantics↔← of programming languages.  The
study of semantics is motivated by the desire to have a tool for describing
the meaning of constructs of a language; in particular, a tool of the power and
clarity of BNF descriptions of the syntax of languages.

The field of programming language semantics is full of ⊗→bull↔← written by people who don't
understand the problem. Here is some more by another.
There are many different schools of thought concerning appropriate means
for describing semantics.
One school proposes to describe the meaning of a language in terms of the process
of compilation.  That is, the semantics is specified by some canonical compiler
producing code for some standard machine-independent machine.  The meaning of
a program is then the outcome of an interpreter interpreting this machine-independent
code.

The key problem is:  just what is the semantic specification supposed to
do?  If it is truly to help a programmer (be he implementor or applications
programmer) to understand the meaning of a particular construct, then this proposal is
lacking.  To understand a construct now requires that you read the description
of a compiler--a non-trivial task--and understand two machines: the machine-independent
machine and the real machine.  There is a more fundamental difficulty here.
When you look at a statement of a high-level language, you think about the effect
the statement will have on the environment of your program when it executes; you
do not think about the code that gets generated and then think about the
execution of that code.  Modeling semantics in terms of a compiler is thus
adding one more level of obfuscation.  A more natural description of the meaning of
constructs is given in terms of the run-time behavior of these constructs.
That's what LISP does.  The %3eval%* function describes the execution sequence of a
representation of an arbitrary LISP expression.  Thus %3eval%* is the semantic
description of LISP.  Now certain timid souls shrink from the resulting
circularity: the description of the semantics of a language in that language; but
circularity, like death and taxes, is inevitable.

Attempts have been made to give non-circular interpreter-based
descriptions of semantics for languages other than LISP.  There is the
incredible VDL description of PL/1, and the convoluted description of ALGOL 68
by a Markov algorithm.

To describe meaning in terms of 'self-evident' string manipulations
is misplaced rigor.  If a semantic description is to be anything more than a
sterile game, it must be useful to implementors as well as applications
programmers.  If you decide to describe the semantics of language L%41%* in a
simpler language L%42%* then either L%42%* is 'self evident' or you must give a
description of the meaning of L%42%*, etc.  So either you go circular or you end
up with a (useless) 'self-evident' L%4n%*.

There are very good reasons for deciding on direct circularity.  First,
you need only learn one language.  If the semantic specification is given in
the source language, then you learn the programming language and the
semantic (meta) language at the same time.  Second, since the evaluator is
written in the language, we can understand the language by understanding
the workings of the single program, %3eval%*; and if we wish to modify the
semantics we need change only one (source language) program.  Thus, if we
wished to add a new language construct to LISP (like the %3prog%* feature) we
would need only modify %3eval%* so that it recognizes an occurrence of a (sexpr
representation of a) %3prog%*, add a new (semantic) function to describe the
interpretation of the construct, and we're done.
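
For instance, a minimal sketch of the extension (the names %3isprog%*, %3evprog%*,
%3denote%*, %3lookup%*, %3clauses%*, and %3body%* are invented here for illustration)
adds one clause to the conditional inside %3eval%*:

.BEGIN SELECT 3;TABIT1(11);GROUP;

eval[e;a]\:[isconst[e] → denote[e];
\ isvar[e] → lookup[e;a];
\ iscond[e] → evcond[clauses[e];a];
\ isprog[e] → evprog[body[e];a];
\  ... ]

.END
All of the work of interpreting the new construct is then localized in %3evprog%*.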


A semantic description should not attempt to explain everything about a language.
You must assume that your reader understands something ... . McCarthy: %2`Nothing
can be explained to a stone'%*.

A semantic description of a language must be natural.  It must match the expressive
power of the language. A semantic description should also model how the constructs
are to be implemented on a reasonable machine.  What is needed is a semantic 
representation which exploits, rather than ignores, the structure of the language.
It can assume a certain level of understanding on the part of the
reader.  It need only explain what needs explaining.

Let's look at syntax for a moment.  A satisfactory description of much of
programming language syntax is standard BNF.  BNF is a generative, or synthetic,
grammar.  That is, it gives rewriting rules specifying how to generate well-formed
strings.  A semantic specification, on the other hand, can be considered
to be a function which takes as input an arbitrary program and gives as output
a representation of the value of executing the program.  This implies that
our semantic function must have some way of analyzing the structure of the program.
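
In LISP terms such a function might be pictured as follows, where %3rep%* is an
invented name for the mapping from external program text to internal representation,
and %3NIL%* plays the initial (empty) environment:

←meaning[p] %c≡%* eval[rep[p];NIL]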


.P75:
In 1962, John McCarthy described the concept of abstract ⊗→analytic syntax↔←. It
is analytic in the sense that it tells how to take programs apart: how to recognize
and select subparts of programs using predicates and selector functions.  It is abstract
in the sense that it makes no commitment to the external representation of the
constituents of the program.  We need only be able to recognize the occurrence
of a desired construct. For example:

←isterm[t]%c≡%*isvar[t]%c∨%*isconst[t]%c∨%*(issum[t]∧isterm[addend[t]]∧isterm[augend[t]])

and the BNF equation:

←<term> ::= <var> | <const> | <term> + <term>.

%3issum%*, %3addend%*, and %3augend%* don't really care whether the sum is represented as:

x+y, or +[x;y], or %3(PLUS X Y)%*, or xy+.  The different external representations
are reflections of different ⊗→concrete syntax↔←es; the BNF equation above gives one.
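
For instance, picking the %3(PLUS X Y)%* representation, one possible realization of
the recognizer and selectors is (a sketch only; nothing in the abstract syntax forces
this choice):

.BEGIN SELECT 3;TABIT1(11);GROUP;

issum[t]\:eq[car[t];PLUS]

addend[t]\:cadr[t]

augend[t]\:caddr[t]

.END
Picking xy+ or +[x;y] instead would change only these definitions, not the programs
written in terms of them.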

What is gained?  Since it is reasonable to assume that the semantics is to
operate on some internal representation of the source program, and in fact,
a representation reflecting the structure of the program (commonly known
as a parse tree), we can describe the semantics of a program in terms of a function
which operates on a parse tree using the predicates and selectors of the
analytic syntax. Abstract syntax concerns itself only with those properties
of a program which are of interest to an evaluator.

You may then profitably think of the Meta expression form of LISP as a concrete syntax.
You may think of the M- to S-expression translator as the parser which maps
the external representation onto a parse (or computational) tree. The selectors
and predicates of the analytic syntax are straightforward; recall the BNF for
LISP:

.BEGIN TABIT1(11);GROUP;

<form>\::= <constant>
\::= <variable>
\::= <function>[<arg>; ... <arg>]
\::= [<form> → <form> ... <form> → <form>]
\  ....

.END
Among other things, then, we need to recognize constants (%3isconst%*),
variables (%3isvar%*), conditional expressions (%3iscond%*), and functions 
(%3isfun%*).
We would then need to write a function in some language expressing the values
of these constructs. Since the proposed evaluator is to manipulate parse
trees, and since the domain of LISP functions %2is%* binary trees, it should seem
natural to use LISP itself.  If this is the case, then we must represent the parse
tree as a LISP sexpr and represent the selectors and  recognizers as LISP functions and
predicates.
Perhaps:

.BEGIN SELECT 3;TABIT1(11);GROUP;

isconst[x]\:numberp[x]%c∨%*eq[x;NIL]%c∨%*eq[x;T]%c∨%*eq[car[x];QUOTE]

isvar[x]\:atom[x]%c∧¬%*(eq[x;T]%c∨%*eq[x;NIL]%c∨%*numberp[x])

iscond[x]\:eq[car[x];COND]

.END
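
To accompany these recognizers we would also need selectors; for example, under the
same sexpr representation (again a sketch, with invented names):

.BEGIN SELECT 3;TABIT1(11);GROUP;

clauses[x]\:cdr[x]

fun[x]\:car[x]

arglist[x]\:cdr[x]

.END
Here %3clauses%* selects the clause list of a %3COND%*, while %3fun%* and %3arglist%*
decompose a function application.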

So to recapitulate: a concrete syntax for LISP can be conceived of as the normal Mexpr
notation; an abstract syntax for LISP can be formulated in terms of the representation of the
LISP program as a Sexpr.  The description of the semantics of LISP is then simply %3eval%*.

There are really two issues here: %2a%* representation of the analytic syntax of a language,
and a representation in terms of the language itself.  In conjunction, these two
ideas give a natural and very powerful means of specifying semantics.

If this means of semantic specification %2is%* really all that good (does
all good things, no bad things, and takes no time) then we should be able to
specify other languages similarly. What are the weak points of LISP as a
`real' programming language?  Mainly the insistence on binary tree representations
of data.  It is quite clear that many applications could profit from other
data representations.  What we would then like is a language which has richer
data structures than LISP but which is designed to allow LISP-style semantic specification.
That is, we will be able to give an analytic syntactic specification for the language.
We will be able to write an evaluator, albeit more complex than %3eval%*, in the language
itself.  The evaluator will operate on a representation of the program as a legal 
data structure of the language, just as %3eval%* operates on the sexpr translation
of the LISP program.  The concrete syntax will be specified as a set of BNF equations,
and our parser will translate legal strings--programs--into parse trees.
In outline, to completely specify a construct in LISP we must have the following:

.BEGIN TABIT1(30);GROUP;

\%21.%*  A concrete production
  **1**\%22.%*  An abstract data type
\%23%*.  A mapping from %21%* to %22%*.
\%24.%*  An evaluator for the abstract type.

.END
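
For example, the sum construct used earlier fills this outline as follows: the
concrete production is <term> ::= <term> + <term>; the abstract data type is
characterized by %3issum%*, %3addend%*, and %3augend%*; the mapping takes x+y onto,
say, %3(PLUS X Y)%*; and the evaluator clause (a sketch, assuming the environment
%3a%* and a numerical %3plus%*) might be:

←issum[t] → plus[eval[addend[t];a];eval[augend[t];a]]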
The `real' programming language ⊗→EL1↔←, Extensible Language 1, %2has%* in fact been specified in exactly
this manner. We could not begin to describe this language here; rather we will
sketch only a bare outline of the ideas involved. As with LISP, EL1 maps programs
onto data structures. EL1 has richer data types including integers, characters, 
pointers, and structures. Its syntax is described in BNF and a mapping from
well-formed syntactic units to data structures is given. The EL1 evaluator
is written in EL1 and manipulates the data-structure representation of EL1 programs
in a manner totally analogous to the LISP %3eval%* function.

As the name implies, a rationale for EL1 was extensibility. 
An ⊗→extensible language↔← is supposed to supply a base or core language, which has
sufficient handles such that a user can describe new data structures, new operations,
and new control structures. These new  objects are  described in terms of 
combinations of constructs in the ⊗→base language↔←. Extensibility is implicitly
committed to the premise that the power of high level languages is
primarily notational rather than computational.
An alternative to extensibility, called ⊗→shell languages↔←, is perhaps
best exemplified by ⊗→PL/1↔←.

`Perhaps the most immediately striking attribute of PL/1 is its bulk'. "bulk"...
that's a polite word for another four-letter expletive.  If nothing else
PL/1 should have the beneficial effect of illustrating the absurdity
of languages which attempt to do all things for all people. ?? PL ... political 
language?? The sheer size of such languages bodes ill for efficient compilation,
maintenance, and learnability.
PL/1 also suffers from the "committee effect" in language design. No great work of art has
ever been designed "by committee"; there is no reason to believe programming language
design should be any different.

We have frequently seen how easy it has been to extend LISP by modifying %3eval%*.
This is particularly easy because we are making modifications at the level of the
data-structure representation of programs. In EL1 we wish to make the extensions at
the level of concrete syntax, rather than at the level of abstract syntax as LISP does.
We can do this by using a parser which is table-driven,
operating on a modifiable set of productions. The parser must be capable
of recognizing occurrences of a set %21. - 4.%* of **1** above, adding
%21%* to its set of productions, using %22%* and %23%* to effect the
translations, and using %24%* to effect the evaluation of an instance of %21%*.
This, in essence, is what EL1 does.
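
To suggest the flavor in LISP terms (purely a hypothetical sketch, not EL1 itself):
keep the productions on a list, each entry collecting the four parts %21. - 4.%* of a
construct's specification; extending the language is then just adding an entry:

.BEGIN SELECT 3;TABIT1(11);GROUP;

extend[prod;type;map;evfn;tbl]\:cons[list[prod;type;map;evfn];tbl]

.END
The parser uses such an entry to recognize an instance of the production, applies the
mapping to build the abstract representation, and dispatches to the evaluator function.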